feat: scaffold for all annotations with reasonable structure #13
Merged
shloknatarajan merged 47 commits into DaneshjouLab:main on Jul 1, 2025
…ations

- Implement module-level caching for get_true_variants() to avoid repeated JSON file loading
- Fix type annotations across multiple files to use Optional[T] instead of T = None
- Add comprehensive efficiency analysis report documenting all identified issues
- Add test script to verify caching functionality works correctly

This addresses the critical efficiency issue where JSON files were loaded on every function call, causing unnecessary disk I/O operations. The caching implementation uses lazy loading with proper error handling for missing files.

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
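The lazy, module-level caching described in this commit could look roughly like the sketch below. The file path, the cache variable name, and the empty-dict fallback for a missing file are assumptions made for illustration; the PR text does not show the actual implementation of `get_true_variants()`.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical default location; the real path is not shown in this PR.
TRUE_VARIANTS_PATH = Path("data/true_variants.json")

# Module-level cache. None means "not loaded yet".
_true_variants_cache: Optional[dict] = None


def get_true_variants(path: Optional[Path] = None) -> dict:
    """Return the parsed JSON, reading it from disk only on the first call.

    Note the Optional[Path] annotation, in the spirit of the commit's
    "Optional[T] instead of T = None" fix.
    """
    global _true_variants_cache
    if _true_variants_cache is None:
        source = path if path is not None else TRUE_VARIANTS_PATH
        try:
            with open(source, encoding="utf-8") as handle:
                _true_variants_cache = json.load(handle)
        except FileNotFoundError:
            # Assumed behaviour: a missing file yields an empty mapping.
            _true_variants_cache = {}
    # Later calls return the cached object; `path` is ignored once loaded.
    return _true_variants_cache
```

One trade-off of this design is that the cache is never invalidated, so edits to the JSON file are not picked up until the process restarts.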
- Add DrugAnnotation and DrugAnnotationList models to src/variants.py
- Create new drug_annotation_extraction.py component with detailed field extraction
- Integrate drug annotation extraction into variant association pipeline
- Add comprehensive test script for verification
- Follow existing LLM infrastructure patterns (Generator/Parser)
- Extract detailed pharmacogenomic fields matching provided schema

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
- Modified extract_drug_annotations to loop through variants one at a time
- Each variant now gets individual LLM processing for better control
- Added SingleDrugAnnotation model for individual variant processing
- Updated logging to show individual variant processing progress
- Maintains same output quality while providing cleaner extraction per variant
- Updated test script to reflect individual processing approach

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
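The per-variant loop described above can be sketched as follows. This is an illustration only: the `SingleDrugAnnotation` fields, the `annotate_one` callable standing in for the LLM call, and the skip-on-error behaviour are all assumptions, since the PR text does not show the real signatures.

```python
import logging
from dataclasses import dataclass
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)


@dataclass
class SingleDrugAnnotation:
    """Stand-in for the PR's model; the real fields are not shown here."""
    variant: str
    drug: Optional[str] = None


def extract_drug_annotations(
    variants: List[str],
    annotate_one: Callable[[str], SingleDrugAnnotation],
) -> List[SingleDrugAnnotation]:
    """Run one extraction call per variant and collect the results."""
    annotations: List[SingleDrugAnnotation] = []
    for index, variant in enumerate(variants, start=1):
        logger.info("Processing variant %d/%d: %s", index, len(variants), variant)
        try:
            annotations.append(annotate_one(variant))
        except Exception:
            # Assumed policy: log the failure and continue with the next variant.
            logger.exception("Extraction failed for %s; skipping", variant)
    return annotations
```

Processing variants individually means one failed call costs only that variant's annotation, at the price of more calls overall.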
…ation-extraction Add drug annotation extraction component for variants with drug associations
…-improvements Efficiency improvements: Cache JSON loading and fix type annotations
…ents

- Add PhenotypeAnnotation and FunctionalAnnotation data models to variants.py
- Create phenotype_annotation_extraction.py with detailed extraction logic
- Create functional_annotation_extraction.py with mechanistic annotation logic
- Update variant_association_pipeline.py to integrate new extraction components
- Follow existing drug annotation extraction patterns
- Use detailed prompt templates from annotation_prompts.md
- Process variants individually for better control and cleaner extraction
- Include proper error handling and logging throughout

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
- Move test_imports.py and test_new_annotations.py to tests/ directory
- Clean up temporary converted notebook file
- Follow proper repository structure conventions

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
- Fix Python path resolution for tests running from tests/ subdirectory
- Ensure tests can properly import src modules from new location
- Verify all tests pass after organizational changes

Co-Authored-By: Shlok Natarajan <shlok.natarajan@gmail.com>
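One common way to make tests under tests/ import a top-level src package is to put the repository root on `sys.path` before the imports run. The helper below is a sketch of that approach, not necessarily what this commit did; the function name `add_repo_root_to_path` is hypothetical.

```python
import sys
from pathlib import Path


def add_repo_root_to_path(test_file: str) -> Path:
    """Insert the parent of the tests/ directory at the front of sys.path.

    Assumes the layout <repo>/tests/<test_file>, so the repository root
    is two levels above the test file.
    """
    repo_root = Path(test_file).resolve().parent.parent
    root_str = str(repo_root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
    return repo_root
```

A test module would call `add_repo_root_to_path(__file__)` at the top, before any import from src.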
…functional-extraction feat: implement phenotype and functional annotation extraction components
Annotation pipeline works and saves outputs